Online Learning Mechanisms for Bayesian Models of Word Segmentation

Authors

  • Lisa Pearl
  • Sharon Goldwater
  • Mark Steyvers
Abstract

In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths and Tenenbaum, Cognitive Psychology, 51, 354–384, 2005; Xu and Tenenbaum, Psychological Review, 114, 245–272, 2007). The models in these studies aim to explain why humans behave as they do given the task and data they encounter, but typically avoid some questions addressed by more traditional psychological models, such as how the observed behavior is produced given constraints on memory and processing. Here, we use the task of word segmentation as a case study for investigating these questions within a Bayesian framework. We consider some limitations of the infant learner, and develop several online learning algorithms that take these limitations into account. Each algorithm can be viewed as a different method of approximating the same ideal learner. When the algorithms are tested on corpora of English child-directed speech, we find that the constrained learner’s behavior depends non-trivially on how the learner’s limitations are implemented. Interestingly, sometimes biases that are helpful to an ideal learner hinder a constrained learner, and in a few cases, constrained learners perform as well as or better than the ideal learner. This suggests that the transition from a computational-level solution for acquisition to an algorithmic-level one is not straightforward.

L. Pearl (corresponding author) · M. Steyvers
Department of Cognitive Sciences, University of California, 3151 Social Science Plaza, Irvine, CA 92697-5100, USA
e-mail: [email protected]

S. Goldwater
School of Informatics, University of Edinburgh, Edinburgh, UK

Related resources

Learning Mechanisms for Bayesian Models of Word Segmentation

In recent years, Bayesian models have become increasingly popular as a way of understanding human cognition. Ideal learner Bayesian models assume that cognition can be usefully understood as optimal behavior under uncertainty, a hypothesis that has been supported by a number of modeling studies across various domains (e.g., Griffiths & Tenenbaum, 2005; Xu & Tenenbaum, 2007). The models in these...

The Effect of Online Learning Tools on L2 Reading Comprehension and Vocabulary Learning

The aim of this study was to investigate the effects of various online techniques (word reference, media, and vocabulary games) on reading comprehension as well as vocabulary comprehension and production. For this purpose, 60 language learners were selected and divided into three groups, and each group was randomly assigned to one of the treatment conditions. In the first session of tre...

Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars

One of the reasons nonparametric Bayesian inference is attracting attention in computational linguistics is because it provides a principled way of learning the units of generalization together with their probabilities. Adaptor grammars are a framework for defining a variety of hierarchical nonparametric Bayesian models. This paper investigates some of the choices that arise in formulating adap...

Modeling online word segmentation performance in structured artificial languages

Lexical dependencies abound in natural language: words tend to follow particular words or word categories. However, artificial language learning experiments exploring word segmentation have so far lacked such structure. In the present study, we explore whether simple inter-word dependencies influence the word segmentation performance of adult learners. We use a continuous testing paradigm inste...

How Ideal Are We? Incorporating Human Limitations into Bayesian Models of Word Segmentation

1. Introduction Word segmentation is one of the first problems infants must solve during language acquisition, where words must be identified in fluent speech. A number of weak cues to word boundaries are present in fluent speech, and there is evidence that infants are able to use many of these, including phonotactics However, with the exception of the last cue, all these cues are language-depe...

Publication date: 2009